Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add PeriodicReportingFetchV2Error to fetch metrics multiple times #40251

Merged
merged 12 commits into from
Jul 17, 2024

Conversation

zmoog
Copy link
Contributor

@zmoog zmoog commented Jul 15, 2024

Proposed commit message

Try to fetch CloudWatch metrics multiple times instead of just once.

CloudWatch metrics can take time to appear after creating a brand-new resource. So, the tests must fetch the metrics multiple times until they find the metric values or reach the time limit.

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding change to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

Related issues

CloudWatch metrics can take time to appear after a we create a brand
new resource.
@zmoog zmoog self-assigned this Jul 15, 2024
@botelastic botelastic bot added the needs_team Indicates that the issue/PR needs a Team:* label label Jul 15, 2024
Copy link
Contributor

mergify bot commented Jul 15, 2024

This pull request does not have a backport label.
If this is a bug or security fix, could you label this PR @zmoog? 🙏.
For such, you'll need to label your PR with:

  • The upcoming major version of the Elastic Stack
  • The upcoming minor version of the Elastic Stack (if you're not pushing a breaking change)

To fixup this pull request, you need to add the backport labels for the needed
branches, such as:

  • backport-v8./d.0 is the label to automatically backport to the 8./d branch. /d is the digit

@kaiyan-sheng kaiyan-sheng added the aws Enable builds in the CI for aws cloud testing label Jul 15, 2024
@zmoog zmoog added the Team:obs-ds-hosted-services Label for the Observability Hosted Services team label Jul 15, 2024
@botelastic botelastic bot removed the needs_team Indicates that the issue/PR needs a Team:* label label Jul 15, 2024
@zmoog zmoog added the bug label Jul 15, 2024
@zmoog zmoog force-pushed the zmoog/aws-metrics-flacky-tests branch from c887f60 to 5b39dc1 Compare July 15, 2024 22:46
@zmoog
Copy link
Contributor Author

zmoog commented Jul 16, 2024

/test

3 similar comments
@zmoog
Copy link
Contributor Author

zmoog commented Jul 16, 2024

/test

@zmoog
Copy link
Contributor Author

zmoog commented Jul 16, 2024

/test

@zmoog
Copy link
Contributor Author

zmoog commented Jul 16, 2024

/test

@zmoog zmoog changed the title [WIP] Retry fetching CloudWatch metrics a few times Add PeriodicReportingFetchV2Error to fetch metrics multiple times Jul 16, 2024
@zmoog zmoog marked this pull request as ready for review July 16, 2024 16:13
@zmoog zmoog requested review from a team as code owners July 16, 2024 16:13
@zmoog zmoog requested review from AndersonQ and belimawr July 16, 2024 16:13
@elasticmachine
Copy link
Collaborator

Pinging @elastic/obs-ds-hosted-services (Team:obs-ds-hosted-services)

@pierrehilbert pierrehilbert added the Team:Elastic-Agent-Data-Plane Label for the Agent Data Plane team label Jul 16, 2024
@elasticmachine
Copy link
Collaborator

Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane)

@pierrehilbert pierrehilbert requested review from faec and VihasMakwana and removed request for AndersonQ and belimawr July 16, 2024 16:21
@@ -20,7 +21,7 @@ func TestFetch(t *testing.T) {
config := mtest.GetConfigForTest(t, "sqs", "300s")

metricSet := mbtest.NewReportingMetricSetV2Error(t, config)
events, errs := mbtest.ReportingFetchV2Error(metricSet)
events, errs := mbtest.PeriodicReportingFetchV2Error(metricSet, 1*time.Minute, 8*time.Minute)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we planning on replacing ReportingfetchV2Error with PeriodicReportingFetchV2Error in other metricsets?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I want to focus this PR on removing the flakiness from the test suite so that other teams can continue with their tasks.

Do you know of other tests that need more than one call to Fetch() to succeed due to latency or something similar?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But if this solution proves effective, we can adopt it for more tests that need more than one fetch.

@pierrehilbert pierrehilbert merged commit 532133b into elastic:main Jul 17, 2024
28 checks passed
@zmoog zmoog deleted the zmoog/aws-metrics-flacky-tests branch July 17, 2024 15:58
@dliappis
Copy link
Contributor

@zmoog should we backport this to 8.15/8.14/7.17 ?

@zmoog
Copy link
Contributor Author

zmoog commented Jul 18, 2024

@zmoog should we backport this to 8.15/8.14/7.17 ?

Good point. Let me add the labels and take care of these backports.

@zmoog zmoog added backport-7.17 Automated backport to the 7.17 branch with mergify backport-v8.14.0 Automated backport with mergify backport-8.15 Automated backport to the 8.15 branch with mergify labels Jul 18, 2024
mergify bot pushed a commit that referenced this pull request Jul 18, 2024
…0251)

* Retry fetching CloudWatch metrics a few times

CloudWatch metrics can take time to appear after a we create a brand
new resource.

* fix wrong import

* Fix events

* Retry 5 times every 60s

* Switch to PeriodicReportingFetchV2Error

* Fix linter complaints

* Remove unused code

* Remove unused code

* Address linter complaints

* Address linter complaints

* Improve function docs

* Cleanup

(cherry picked from commit 532133b)
mergify bot pushed a commit that referenced this pull request Jul 18, 2024
…0251)

* Retry fetching CloudWatch metrics a few times

CloudWatch metrics can take time to appear after a we create a brand
new resource.

* fix wrong import

* Fix events

* Retry 5 times every 60s

* Switch to PeriodicReportingFetchV2Error

* Fix linter complaints

* Remove unused code

* Remove unused code

* Address linter complaints

* Address linter complaints

* Improve function docs

* Cleanup

(cherry picked from commit 532133b)
mergify bot pushed a commit that referenced this pull request Jul 18, 2024
…0251)

* Retry fetching CloudWatch metrics a few times

CloudWatch metrics can take time to appear after a we create a brand
new resource.

* fix wrong import

* Fix events

* Retry 5 times every 60s

* Switch to PeriodicReportingFetchV2Error

* Fix linter complaints

* Remove unused code

* Remove unused code

* Address linter complaints

* Address linter complaints

* Improve function docs

* Cleanup

(cherry picked from commit 532133b)

# Conflicts:
#	metricbeat/mb/testing/modules.go
zmoog added a commit that referenced this pull request Jul 18, 2024
…trics multiple times (#40295)

* Add PeriodicReportingFetchV2Error to fetch metrics multiple times (#40251)

* Retry fetching CloudWatch metrics a few times

CloudWatch metrics can take time to appear after a we create a brand
new resource.

* fix wrong import

* Fix events

* Retry 5 times every 60s

* Switch to PeriodicReportingFetchV2Error

* Fix linter complaints

* Remove unused code

* Remove unused code

* Address linter complaints

* Address linter complaints

* Improve function docs

* Cleanup

(cherry picked from commit 532133b)

# Conflicts:
#	metricbeat/mb/testing/modules.go

* Resolve conflicts

---------

Co-authored-by: Maurizio Branca <[email protected]>
Co-authored-by: Maurizio Branca <[email protected]>
zmoog pushed a commit that referenced this pull request Jul 18, 2024
…0251) (#40297)

* Add PeriodicReportingFetchV2Error to fetch metrics multiple times

CloudWatch metrics can take time to appear after we create a brand
new resource.
zmoog added a commit that referenced this pull request Jul 18, 2024
…0251) (#40296)

* Retry fetching CloudWatch metrics a few times

CloudWatch metrics can take time to appear after a we create a brand
new resource.

* fix wrong import

* Fix events

* Retry 5 times every 60s

* Switch to PeriodicReportingFetchV2Error

* Fix linter complaints

* Remove unused code

* Remove unused code

* Address linter complaints

* Address linter complaints

* Improve function docs

* Cleanup

(cherry picked from commit 532133b)

Co-authored-by: Maurizio Branca <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
aws Enable builds in the CI for aws cloud testing backport-7.17 Automated backport to the 7.17 branch with mergify backport-8.15 Automated backport to the 8.15 branch with mergify backport-v8.14.0 Automated backport with mergify bug Team:Elastic-Agent-Data-Plane Label for the Agent Data Plane team Team:obs-ds-hosted-services Label for the Observability Hosted Services team
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants